Newest 'scikit-learn machine-learning' Questions

0 votes

0 answers

12 views

Isolation Forest sample size

I am using sklearn's Isolation Forest as a model to detect anomalies. My dataset is relatively small, 50 records with only 2-3 features. To prevent any overfitting, what would you recommend to tune ...

Mar

85

asked Apr 21 at 18:28

4 votes

1 answer

54 views

Unsupervised Isolation Forest sklearn hyperparameters

I am using sklearn's IsolationForest for an unsupervised anomaly detection task. According to the docs, https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html, there are ...

Mar

85

asked Apr 17 at 17:33

-1 votes

0 answers

37 views

ML model for Career Prediction

I am NOT able to figure out how to make an ML model. I have been using ChatGPT for most of it, and apart from understanding the code, I'm doing next to nothing. No matter what code I input, the accuracy is always 0%... ...

Ananya Vijay

7

asked Apr 9 at 10:27

2 votes

1 answer

44 views

I can't get my R² above 70%

I tried RandomForest, LGBM, KNeighbors and Polynomial Regression as algorithms, along with cross-validation, a train-test split and a standard scaler, but nothing seems to get it past the 70% mark. The dataframe has ...

user178825

23

asked Mar 7 at 0:22

1 vote

1 answer

45 views

RFECV and grid search - what sets to use for hyperparameter tuning?

I am running machine learning models (all with scikit-learn estimators, no neural networks) using a custom dataset with a number of features and binomial output. I first split the dataset into 0.6 (...

Alex

11

asked Jan 16 at 23:19

1 vote

1 answer

48 views

Manual Python Implementation of Stacking Model

I tried to build a Python class, CustomStackingClassifier(), to implement the Stacking method in ensemble machine learning. In this implementation, the output of the base classifiers is set to be the ...

CM_Li

13

asked Jan 2 at 1:15

3 votes

1 answer

81 views

Comparing clusterings from different datasets

I have 2 different data sets with essentially the same variables, though one is data from one year and the other is data from another year. I've run KModes on both data sets and now have some ...

ethqnol

31

asked Dec 20, 2024 at 23:41

2 votes

2 answers

142 views

Random Forest always predicting the majority class

I'm predicting disease outcome using biological data (metabolites plus covariates age, sex and BMI). The outcome is a binary variable and moderately imbalanced (~12% positive cases). I have a ...

be_nice

31

asked Nov 6, 2024 at 18:41

0 votes

0 answers

34 views

Is it possible to compute Davies Bouldin score from a precomputed distance matrix using sklearn?

I'm trying to compute the Davies Bouldin score to compare different clustering approaches. I have a precomputed distance matrix (that represents edit-based distance between texts). I'm using the scikit-...

Tim

1

asked Oct 25, 2024 at 19:36

0 votes

1 answer

79 views

As an intermediate R programmer looking to dive into machine learning, should I choose Python or stick with R?

Background: I am an intermediate R programmer with some experience in machine learning concepts and simple modeling in R. I have an opportunity to collaborate with a professional machine learning team ...

a.sa.5969

11

asked Sep 13, 2024 at 21:38

0 votes

0 answers

271 views

Correct method to report Randomized Search CV results

I have searched online but I still cannot find a definitive answer on how to "correctly" report the results from hyperparameter tuning a machine learning model; though, this may just be some ...

user167433

173

asked Aug 21, 2024 at 18:54

-1 votes

1 answer

58 views

label encoding & one hot encoding

I have read somewhere that label encoding is only used for the target variable, and that for the input features we can use one-hot encoding (nominal features) and ordinal encoding (features having an order). I am ...

Sofia Malik

1

asked Aug 3, 2024 at 19:41
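
The distinction raised in the excerpt above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names (`label_encode`, `ordinal_encode`, `one_hot_encode`) and the toy data are hypothetical, and the functions mimic the general idea behind scikit-learn's `LabelEncoder`, `OrdinalEncoder` and `OneHotEncoder` rather than reproducing their APIs.

```python
def label_encode(values):
    """Map target classes to integers; the integer order carries no meaning."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return classes, [index[v] for v in values]


def ordinal_encode(values, order):
    """Map each value to its rank in an explicit, caller-supplied order."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]


def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical toy data: one ordered feature, one nominal feature, one target.
sizes = ["small", "large", "medium", "small"]
colors = ["red", "green", "red", "blue"]
target = ["no", "yes", "yes", "no"]

size_codes = ordinal_encode(sizes, order=["small", "medium", "large"])
color_categories, color_rows = one_hot_encode(colors)
target_classes, target_codes = label_encode(target)

print(size_codes)
print(color_categories, color_rows)
print(target_classes, target_codes)
```

The ordinal codes preserve the stated order of `sizes`, while the one-hot rows avoid implying any order among `colors`; that difference is the usual reason for treating the two kinds of feature separately.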

0 votes

0 answers

11 views

Implementation of multi-classification meta-estimators in scikit-learn

In scikit-learn we have different methods to deal with multi-classification problems. Below are some of the meta-estimators used: a. OneVsRestClassifier and ...

SOHAM SACHIN KULKARNI

1

asked Jul 28, 2024 at 10:23

4 votes

2 answers

177 views

Loss function in Isolation Forest

I recently came across this algorithm while working on my graduation project. As per my understanding, we create sub-trees for each sub-sample. Then we calculate the scores for each ...

Mayank Singh

41

asked Jun 10, 2024 at 5:14

3 votes

2 answers

576 views

How can I fit sklearn.svm.SVC with three features, given that the features are actually arrays of lengths 128, 12 and 40?

To clarify, each instance of feature_1 is a 128 item long array, each instance of feature_2 is a 12 item long array, and each instance of feature_3 is a 40 item long array. I am currently simply doing ...

Karn Varshneya

39

asked May 27, 2024 at 11:52
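
The excerpt above is truncated, so what the asker currently does is not known. As a hedged sketch of one common option: scikit-learn estimators expect a 2-D input of shape `(n_samples, n_features)`, so the three per-sample arrays can be concatenated into a single row of 128 + 12 + 40 = 180 values. The helper name `build_feature_row` and the dummy data are hypothetical; with NumPy the same step is usually done with `np.hstack` or `np.concatenate`.

```python
EXPECTED_LENGTHS = (128, 12, 40)  # lengths stated in the question


def build_feature_row(feature_1, feature_2, feature_3):
    """Concatenate the three per-sample arrays into one flat feature row."""
    parts = (feature_1, feature_2, feature_3)
    for part, expected in zip(parts, EXPECTED_LENGTHS):
        if len(part) != expected:
            raise ValueError(f"expected length {expected}, got {len(part)}")
    return [x for part in parts for x in part]


# Hypothetical dummy samples; real data would come from the asker's pipeline.
samples = [
    ([0.1] * 128, [0.2] * 12, [0.3] * 40),
    ([0.4] * 128, [0.5] * 12, [0.6] * 40),
]

# X has one row per sample and 180 columns, the layout an estimator's
# fit(X, y) method expects.
X = [build_feature_row(*sample) for sample in samples]
print(len(X), len(X[0]))
```

Because the three blocks may sit on very different scales, scaling the concatenated columns before fitting an SVM is often worth considering; whether it helps here depends on data the excerpt does not show.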
